Abstract—Ethereum is undergoing significant changes to its
architecture as it evolves. These changes include its switch to 
PoS consensus and the introduction of significant infrastructural 
changes that do not require a change to the core protocol, 
but that fundamentally affect the way users interact with the 
network. These changes represent an evolution toward a more 
modular architecture, in which there exist new exogenous
vectors for centralization. This paper builds on previous studies 
of decentralization of Ethereum to reflect these recent significant 
changes, and Ethereum’s new modular paradigm. 

Index Terms—blockchain, Ethereum, decentralization, cryp- 
tocurrency, cryptoeconomics 


I. INTRODUCTION 


The contribution of this paper is to propose a model
for measuring decentralization that accommodates structural
changes in wider network topology over time. As web3 and
cryptocurrencies are a relatively nascent socio-technological
innovation, they are in a phase of initial rapid innovation,
in which the architecture and topology of the networks are
evolving significantly. This is a well understood phenomenon
in technological innovation, documented as early as the
1960s [1], in which the innovation and adoption of
new technologies develop in an “S-Curve” shape, involving
compressed stages of very rapid innovation followed by a
period where innovation plateaus for a time. Ethereum is an
example of a technology that is in the rapid innovation phase,
in which there are significant changes to the topology of the
overall network, both intrinsic and extrinsic to the core
protocol. Whereas previous research focused on measuring
decentralization at the various layers of a vertical stack within
a monolithic system, our model views Ethereum as a
socio-technological ecosystem in which significant components of
the network develop outside the core protocol.

The paper is organized as follows: in section II we deliver 
an overview of how Ethereum is evolving and the challenges 
faced when attempting to measure its level of decentralization. 
In section III we outline the various dimensions that we 
propose to measure with our model, and our data sources. In 
section IV we describe our methodology, including the various 
indices that are applied to our data. In section V we deliver 
a breakdown of results, and we close with our conclusions in 
section VI. 


II. BACKGROUND 


It can be argued that protocols that are built on top of the
base layer of Ethereum do not pose a direct threat to the
base layer itself, even when they are highly centralized, and
should therefore not be a factor in quantifying the network’s
level of decentralization. Once the base layer is sufficiently
decentralized, any number of protocol designs can be
implemented on top of it, and ideally the base layer should not be
aware of them, or be adversely affected by them. However, as
Ethereum evolves, users increasingly interact with the network
through abstracted layers of infrastructure that overlay the
core protocol, and as such it can be conversely argued that
these layers can affect the
performance of the overall network under certain conditions,
and should therefore be considered within a holistic model of
measurement of the network’s level of decentralization. Our
criterion for inclusion within our model is that the component
being measured does not just serve a single use case or
application, but is a protocol through which users interact with
a substantial number of other dapps and protocols.

Any infrastructure that assumes a significant role in 
Ethereum can pose a centralization risk to the overall network 
based on two critical factors: 


a. 


as measured in either the amount of base layer transac- 
tions that flow through the infrastructure and/or the Total 
Value Locked (TVL) compared to the base layer. 


b. 


whether this effect is a level of effective degraded perfor- 
mance of the network, or an increased level of censorship. 


Ethereum is not a static ecosystem, and other innovations 
will likely assume a prominent role within the ecosystem in 


while still being able to track 
the changes of effective decentralization over time. 


III. SELECTION OF DATA POINTS
A. Overview of Data Model 


We use as a base for our model the measurement of
decentralization in blockchain first described by Balaji Srinivasan as
the Minimum Nakamoto coefficient [7]. This model considers
a blockchain network as being composed of a number of
subsystems, which are important in terms of maintaining
decentralization within an ecosystem, allowing it to remain
resistant to capture by any one party or group. Srinivasan
describes any blockchain as being only as decentralized as the
least decentralized subsystem, and his original model loosely
defines a number of discrete subsystems to measure.

We have adapted Srinivasan’s model to the contemporary 
PoS Ethereum topology and introduced several other dimen- 
sions that represent exogenous vectors for potential central- 
ization. Our model thus extends Srinivasan’s original model 
from 6 dimensions to 12. These dimensions of measurement 
are listed below, and are followed by a detailed explanation 
of the rationale for each. 


• Based on original Nakamoto Coefficient subsystems:


• Metrics pertaining to PBS:


• Metrics pertaining to Account Abstraction:


• Miscellaneous Metrics:


B. Metrics based on the Nakamoto Coefficient Subsystems 


We have adapted the Nakamoto Coefficient model through a 
number of modifications to the original model. These changes 
include removing Mining Decentralization and Developer De- 
centralization. 

The “Mining Decentralization” metric is no longer relevant 
in PoS Ethereum and as such has been replaced by the 
“Amount staked by pool” metric, which measures the relevant 
share of the staked ETH by staking service provider. 

The “Developer Decentralization” metric is no longer an 
applicable metric for PoS Ethereum. The rationale for this 
change is the fact that nodes on the network run a number 
of different client implementations, each with its own distinct 
development team. In this context, and considering a priori that 
developers are unique to each team, it is sufficient to measure 
the level of client diversity among nodes on the network rather 
than the relative contributions of individual developers. 

The “Exchanges by Supply” metric is not employed in our 
model as there is not a strong enough argument as to relevance 
of this metric to decentralization in Ethereum. 

In terms of client diversity, it is also necessary to update the 
model to reflect the fact that Ethereum is now technically two 
merged blockchains that operate in unison, the Beacon Chain 


which handles consensus, and the execution layer, which is the 
P2P layer that gossips transactions and handles execution. For 
this reason, the original “Client Decentralization”, and “Node 
Decentralization” metrics have been replaced by “Consensus 
/ Execution nodes by client / country” metrics. 

The only metric that has been retained in its original form 
is “Distribution of native asset by amount”, which measures
wealth inequality in terms of ownership of ETH. 


C. Metrics pertaining to Proposer Builder Separation 


Our model introduces two new metrics that pertain to Pro- 
poser Builder Separation (PBS), which are “Blocks proposed 
by builder” and “Blocks proposed by relay” respectively. 

PBS is a network topology that has not been implemented 
at the protocol level, but has been implemented via the mev-boost
middleware developed by Flashbots [8], and which came
online at the time of Ethereum’s switch to PoS. 

Fundamentally, PBS allows for the separation of concerns 
between block building and block proposing [9], whereas
currently the protocol assigns the responsibility of both to 
the validator. Ethereum’s PoS protocol requires validators to 
broadcast a valid block of transactions to the network when 
they are selected as a proposer. As per the specification, 
validators will build a block locally by requesting their local 
execution client to collate pending transactions from the public 
P2P network. However, validators can install mev-boost and 
can request blocks from third party specialist block builders 
via public relays, instead of building one themselves [10].

This has several benefits, from lowering the resource re- 
quirements for running a validator node, to reducing central- 
izing economics of MEV in staking pools [11]. However, 
it also introduces a number of other actors into Ethereum’s 
infrastructure topology, i.e. block builders and relays, creating 
new vectors for potential centralization. 
Fig. 1. High level mev-boost architecture


Our model applies a weighting to the PBS metrics when 


considering the measurement in the context of Ethereum’s 


overall level of decentralization. This is because any byzantine 


(1). 


actions according to specific criteria. This effectively results in 


those transactions experiencing a potentially significant delay
in being included in a block (about 68% longer than regular
transactions according to Yang et al.).

This can effectively create a two-tier network, with transactions
associated with certain addresses becoming “less privileged”
than other transactions. Ironically the more transactions


— as block builders will need to bid higher than the 


combined value of those transactions in order to have their 
blocks proposed, resulting in an effective per-block fee for 
censoring transactions [15]. However, the higher the level of
centralization within the block builder / relay infrastructure, 
the greater the risk for censorship, creating increased barriers 
to participation in the network for affected users. 


D. Metrics pertaining to Account Abstraction 


Our model introduces two metrics that pertain specifically 
to account abstraction, including “Number of user operations 
per bundler” and “Number of wallets per deployer”. Account 
Abstraction has been a goal of Ethereum since its inception, 
and there have been a number of previous proposals that were 
not implemented [16], [17], [18], which all involved some change
to the core protocol. The breakthrough came with “ERC-4337: 
Account Abstraction Using Alt Mempool” [19], which does 
not require a protocol change, but which introduces new roles 
within the ecosystem topology: bundlers and paymasters. 

ERC-4337 specifies a pseudo-transaction object called a
user operation, or “userop”. User operations are submitted
to bundlers, who batch them into a single transaction to a 
global entrypoint contract, which iterates over the userops 
in the batch, passing them to their respective smart contract 
wallets along with the userop’s calldata for the contract wallet 
to execute (e.g. send ETH or call a function on some specific 
smart contract). 
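
The flow described above can be sketched as a toy simulation. This is an illustration only: the class names `UserOp`, `Wallet`, `EntryPoint` and `Bundler`, their fields, and the addresses are assumptions made for this example and do not reproduce the ERC-4337 interfaces or any real client. It shows a bundler batching user operations into one call to a shared entry point, which dispatches each operation to its sender's wallet, and it keeps the per-bundler count on which a “user operations per bundler” measurement would be based.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class UserOp:
    """Toy stand-in for a user operation (fields are illustrative)."""
    sender: str    # address of the smart contract wallet
    calldata: str  # what the wallet should execute


@dataclass
class Wallet:
    """Toy smart contract wallet that records what it was asked to execute."""
    executed: List[str] = field(default_factory=list)

    def execute(self, calldata: str) -> None:
        self.executed.append(calldata)


class EntryPoint:
    """Toy global entry point: iterates over a batch and dispatches each userop."""

    def __init__(self, wallets: Dict[str, Wallet]) -> None:
        self.wallets = wallets

    def handle_ops(self, ops: List[UserOp]) -> None:
        for op in ops:
            self.wallets[op.sender].execute(op.calldata)


class Bundler:
    """Toy bundler: collects userops and submits them as one batch."""

    def __init__(self, entry_point: EntryPoint) -> None:
        self.entry_point = entry_point
        self.pending: List[UserOp] = []
        self.ops_bundled = 0  # running count of userops handled by this bundler

    def submit(self, op: UserOp) -> None:
        self.pending.append(op)

    def bundle(self) -> None:
        batch, self.pending = self.pending, []
        self.entry_point.handle_ops(batch)  # one "transaction" carrying the batch
        self.ops_bundled += len(batch)


# Hypothetical addresses and calldata, for illustration only.
wallets = {"0xA": Wallet(), "0xB": Wallet()}
bundler = Bundler(EntryPoint(wallets))
bundler.submit(UserOp("0xA", "send 1 ETH"))
bundler.submit(UserOp("0xB", "call foo()"))
bundler.bundle()
```

Counting `ops_bundled` per bundler over an interval would give the kind of distribution to which the concentration indices can be applied.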
Fig. 2. ERC-4337 Account Abstraction Architecture


this class of actors is lower than other parts of the infrastruc- 
ture we have included. Although bundlers can choose to censor 


specific transactions, the censored sender can simply decide to 
send their transaction directly to the entry-point contract, or to 
their smart contract wallet directly, if its design allows. This 
represents a relatively weak form of censorship, but it does 
require some user sophistication in order to bypass. Over- 
centralization in this part of Ethereum’s infrastructure can 
potentially lead to censorship risks, resulting in an effective 
two-tier network, with some addresses being less privileged 
than others in terms of their access to the network. We posit 
that this is a reasonable basis for including this metric in our 
model, albeit with an adjusted weighting. 


E. Layer 2 Rollups by Relative TVL 


Our model introduces a metric to measure L2 Rollups by 
TVL relative to the TVL of the base layer. As Ethereum 
progresses through its “rollup-centric roadmap” [20], the TVL
of L2 rollups as proportionate to the overall network becomes 
more significant within the composition of the ecosystem. 

Our model applies a weighting to this metric, taking into 
account the extent of the risk that centralization within these 


protocols pose, i.e. 


Consider as an example an L2 rollup with a centralized 
sequencer that experiences a significant liveness fault, in which 
users with funds on that network are no longer able to transact 
as they would under normal conditions. In this scenario, it may 
be possible that the users can force a withdrawal through the 
L2’s base layer smart contract bridge. However, if this rollup 
contains a substantial number of user accounts, it may result 
in significant congestion on the base layer [21]. This would
likely cause an increase in base fee and a prolonged delay 
in transaction inclusion. Furthermore, in the case of tokens 
that are minted natively on an L2, it may not be possible to 
withdraw them to the base layer at all. 

There is also a theoretical risk to Ethereum’s underlying 
social consensus from having a single dominant rollup, which 
is described by Buterin [22] as forming a broad assumption
within the ecosystem that “if there is a bug that causes funds
to be stolen, the losses will be so large that the community 
will have no choice but to fork to recover the users’ funds”. 


F. Miscellaneous Metrics


As part of our model we measure the “effective inflation
rate adjusted for burn”. This is an important metric for any
PoS base layer protocol, including Ethereum.

Polynya describes a number of examples of this phenomenon
that have been observed in practice [23].


Our model thus incorporates a simple metric for measuring
the inflation rate and adjusting it for the amount of ETH
that is burned through the EIP-1559 mechanism, in which the
base fee, which is adjusted for every block by the protocol
itself, and that each transaction must pay at a minimum
in order to be included in a block, is subsequently burned
when a block is proposed. This has the effect of creating
a negative issuance rate once transaction volume surpasses a
critical threshold, which will likely decrease the total supply
over time [24].

We have also incorporated a closely related metric which
is the “percentage of total supply staked”. This is directly
related to the effective inflation rate metric with respect to
maintaining an economic equilibrium between the issuance
rate and circulating supply, allowing the asset to hold its
value over time [25]. It is worth pointing out that Ethereum’s

reducing issuance as more validators come online [26], which
theoretically reduces the incentive to stake once the percentage
of staked assets reaches a certain threshold. However, there is
always the possibility that innovations such as EigenLayer may
disrupt this equilibrium over time.

Another meaningful metric that we have introduced into
our model is the measurement of “Stablecoins by relative
TVL on Ethereum”. There exist both algorithmic stablecoins
(which are backed by a number of other assets and which
rely on networks of decentralized oracles, and which are at
least notionally decentralized by nature), and
stablecoins which are backed largely by fiat deposits, and
which are issued by a centralized authority. The latter type
of stablecoin is relevant to our model, insofar as while these
stablecoins are not part of the infrastructure of the network,
they are a significant part of the ecosystem with which users
interact with other dapps. They also pose a centralization risk
in terms of censorship as there have been a number of cases
where stablecoins have been frozen from specific addresses
[27].


IV. METHODOLOGY


The measurement of inequality in distributions is a well
understood area of statistics that has found many applications
in the fields of economics and social sciences. As our model
aims to measure the level of decentralization across a number
of different dimensions with different qualities, it is necessary
to incorporate a number of different statistical measurements.

We use as a base measurement the Gini index and the
Herfindahl-Hirschman index. The Gini index is arguably the
most widely used index with regards to measuring wealth
inequality, and is well suited for measuring the distribution
of a network’s native asset (i.e. ETH), while the
Herfindahl-Hirschman index (HHI) is more often used for measuring the
level of competition in specific industrial sectors, making it
more suitable for measuring the degree of decentralization
within the block builder market. The base measurement is
applied to each measurement dimension, but our results will
reference one or the other based on the respective qualities of
each

cross-reference any indicative result from our base measurement. This

to the

is useful because the Shannon Index is based on a different
approach to the Gini Index and the HHI. The Atkinson Index

updating the index parameters to make it more sensitive to
changes at different extremes of the distribution. This can be
useful if we want to qualify our results with more detail on
any part of a distribution at certain times.

parameterization of the Atkinson Index. These include the
P90:P10, and P50:P10 ratios, as well as the Palma ratio. These
ratios

percentiles, for example, the P90 represents the level of
resources allocated to > 90% of the population, while the
P10 represents the level of resources allocated to > 10% of
the population. Our model adjusts the parameterization of the
Atkinson index with respect to the values of these tail
ratios.

such as 30, 60 and 90 day intervals.

Our model also incorporates a master index in order to track

This master index is an aggregate of other relevant
indices, that is

The indices that we have employed in our model are as
listed below and are described in detail in the following
sections. Each index has characteristics and trade-offs associated
with the underlying approach or model that they are based on.

Underlying Approach | Index
Deviations model | Gini index
Combinatorics model | Herfindahl-Hirschman index
Entropy model | Shannon index
Social welfare model | Atkinson index
Tail ratios | Palma ratio, Pareto ratio
Divergence measures | Jensen-Shannon Divergence
Weighted geometric mean | Master Index


A. Gini Index


The Gini Index was developed by Corrado Gini [28] as
a mechanism for

in a population, and is arguably the most commonly used
measurement of inequality across a number of fields. It is
employed as the

and has been used in several previous studies of
decentralization in Bitcoin and/or Ethereum
[36], which allows for some comparison with
the results of previous studies.



The Gini Index is derived from the Lorenz curve [37], which
allows us to plot the individual shares of the distribution in
relation to the overall total distribution. This becomes
particularly

within a distribution
at a high level, and can also be very useful for comparing
inequality between two distributions easily.
Fig. 3. Visualization of Lorenz Curve


There are a number of different methods of calculating the
Gini Index (Tutberidze et al. describe four [38]), though most
methods are based on the

While the area under a curve is commonly calculated using
the Newton-Leibniz formula shown below, where L(x) is the Lorenz
curve, it can also be approximated as the sum of the areas of
a series of trapeziums, correlating in width to the unit interval
being measured.

$$G = 1 - 2\int_{0}^{1} L(x)\,dx$$
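
The trapezium approximation mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the implementation used in this paper: the function name `gini_from_lorenz` is hypothetical, each entity is assumed to span an equal 1/n slice of the population axis, and the sample inputs are invented.

```python
from typing import Sequence


def gini_from_lorenz(values: Sequence[float]) -> float:
    """Approximate G = 1 - 2 * (area under the Lorenz curve) with trapeziums."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need at least one value and a positive total")

    area = 0.0
    cumulative = 0.0
    prev_l = 0.0     # the Lorenz curve starts at L(0) = 0
    width = 1.0 / n  # each entity spans 1/n of the population axis
    for x in xs:
        cumulative += x
        cur_l = cumulative / total              # Lorenz curve at this point
        area += width * (prev_l + cur_l) / 2.0  # area of one trapezium
        prev_l = cur_l
    return 1.0 - 2.0 * area


# Hypothetical holdings, for illustration only.
equal = gini_from_lorenz([1, 1, 1, 1])      # everyone holds the same amount
one_holder = gini_from_lorenz([0, 0, 0, 1])  # one entity holds everything
```

With a finite population of n entities the second case evaluates to 1 - 1/n, which approaches 1 as n grows.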


While the calculation of the Gini Index using the Lorenz
Curve is useful for visualizing the distribution and level of
inequality on a chart, it is also possible to calculate it
using an approach that is based on

[40], which is the approach
that we have implemented:

G =


The Gini index gives us a value between 0 and 1, where


0 indicates perfect equality of distribution of resources within 
the population, and 1 is total inequality, i.e. one single entity 


controls 100% of the resources. 


B. Herfindahl-Hirschman Index 


The Herfindahl-Hirschman Index is commonly used as a

sector. It
has found applications in regulation, particularly with antitrust
authorities [41]. It is calculated as the sum of the squares of
the market shares of each participant.


As the HHI is based on percentage shares of the market, it
becomes close to zero for a market that has been commoditized

equal share. Conversely, the HHI

highly concentrated market,
with 10,000 (or 1 × 100²) being


Our model adapts the standard HHI by re-scaling it to

by dividing the HHI by $10^4$ so that it falls in the interval
[0, 1].

As such, the re-scaled HHI, denoted by $\theta$,
is expressed via the following formula, where $n$ is the total
number of participants in the market, $P$ is the total number of
units produced in the entire market, and $p_i$ is the number of
units consumed from the $i$-th participant.

$$\theta = \frac{\sum_{i=1}^{n} \left( \frac{p_i}{P} \cdot 100 \right)^2}{10^4}$$


As an example, in measuring concentration in the block
building market, $P$ would be the total number of blocks
produced in an interval, and $p_i$ would be the blocks proposed to
the network that are built by the $i$-th block builder.

The US DoJ generally classifies markets within three
discrete categories [41]:

Unconcentrated | HHI < 1,500
Moderately Concentrated | 1,500 <= HHI <= 2,500
Highly Concentrated | HHI > 2,500
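
The re-scaled HHI and the thresholds above can be illustrated with a short sketch. The function names `rescaled_hhi` and `doj_category` and the block counts are assumptions made for this example; the thresholds are the ones listed above, applied on the conventional 0 to 10,000 scale.

```python
from typing import Sequence


def rescaled_hhi(counts: Sequence[float]) -> float:
    """Sum of squared percentage shares, divided by 10^4 so the result is in [0, 1]."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("total must be positive")
    return sum((c / total * 100.0) ** 2 for c in counts) / 1e4


def doj_category(hhi_points: float) -> str:
    """Classify an HHI given on the 0..10,000 scale using the thresholds above."""
    if hhi_points < 1500:
        return "Unconcentrated"
    if hhi_points <= 2500:
        return "Moderately Concentrated"
    return "Highly Concentrated"


# Hypothetical block counts per builder over one interval (illustrative only).
blocks_by_builder = [400, 300, 200, 100]
theta = rescaled_hhi(blocks_by_builder)
category = doj_category(theta * 1e4)
print(round(theta, 4), category)
```

Here the shares are 40%, 30%, 20% and 10%, so the unscaled HHI is 1,600 + 900 + 400 + 100 = 3,000, which falls in the highly concentrated band.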


Because the Herfindahl-Hirschman Index is commonly used 
for identifying and measuring the presence of monopolies in 
industry, it is more 


that measure the infrastructure that is provided as a service 
or public good by a relatively small number of actors, as 


opposed to measuring the distribution of ownership or control 
of some asset. This makes it particularly useful for measuring 
middleware such as block builders or relays, or ERC-4337 
bundlers. 


C. Shannon Index 


The Shannon Index is part of a family of measurements 
that are derived from information theory and which are based 
on the concept of entropy. Other measurements in this category 
include the Generalized Entropy measurement, and the Theil 
indices. As the Shannon Index is based on a very different
approach from either the Gini index or the HHI, it is a useful
measurement

The Shannon index is a measure of the amount of entropy
in a dataset. It was intended to be used to measure the amount
of information content in a signal, where information content
can be considered a measurement of the unexpectedness of
a value occurring at a specific point in the signal,
and entropy is the average level of unexpectedness within a
signal.


Shannon gives examples of strings of characters, where 
the more characters and randomness there is, the lower the 
probability of predicting the next character in the string, and 
the more unexpected it is when that value occurs as predicted. 
The less often the next character can be predicted correctly, 
the higher the entropy. 

A basic example of this concept is a coin toss, where one 
party chooses heads, and where there is a 50% chance of the 
expected value (i.e. heads) occurring as predicted. As this has a 
relatively high probability, p(heads) = 0.5, coin tosses have 
low entropy. If we roll a die, the entropy increases as the 
probability of an expected value decreases, e.g. p(6) ≈ 0.17.
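
The coin and die comparison can be checked numerically. The sketch below uses the standard definition of Shannon entropy, H = -Σ p·log(p); the function name `shannon_entropy` and the choice of base 2 (entropy in bits) are assumptions made for this example.

```python
import math
from typing import Sequence


def shannon_entropy(probabilities: Sequence[float], base: float = 2.0) -> float:
    """H = -sum(p * log(p)); zero-probability outcomes contribute nothing."""
    return -sum(p * math.log(p, base) for p in probabilities if p > 0)


coin = shannon_entropy([0.5, 0.5])  # fair coin: two equally likely outcomes
die = shannon_entropy([1 / 6] * 6)  # fair die: six equally likely outcomes
print(coin, die)
```

The fair coin yields one bit of entropy and the fair die yields log2(6), roughly 2.58 bits, consistent with entropy rising as each individual outcome becomes less likely.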

Entropy as applied within the field of economics was 
systematized by Theil [B], and has found applications in mea- 
suring inequality within a distribution of resources, and later 
within several studies of decentralization in cryptocurrencies 
BJ. 

The Shannon index is commonly expressed using the
following formula, where $N_i$ is the frequency of each value in
the dataset divided by $N$, the total number of distinct values
in the dataset, i.e. $N_i$ is the number of entities within the

The Shannon index was designed for use with categorical
data, as opposed to continuous data, for which the GE index
is better suited [44]. However, for the purposes of measuring
distribution of native asset (i.e. ETH), we apply the
Shannon index to ranges of amounts of ETH. In this context,
$p_i$ is the proportion of the asset owned by the $i$-th percentile
of the population.


The Shannon index has a range between 0 and the logarithm
of the number of categories in the dataset,
i.e. $0 \le H' \le \log(n)$.

Shannon index will be to zero [32].


D. Atkinson Index 


The Gini Index is useful as a base for calculating inequality,
but it has limitations in describing the qualities of inequality.
As the Gini Index is based on the ratio of total areas under
the curve, it

and
it also means that two different distributions can potentially

For this reason, our model employs the Atkinson index
[45] as a further measure of decentralization, to allow us to
cross-reference our Gini index against a measurement that
can be fine-tuned to our requirements, and which can be
used to capture any nuance in the distribution of different
measurements.
The Atkinson index is


where the parameters include:

$\varepsilon$ | inequality aversion parameter where $\varepsilon > 0$
$N_i$ | number of people in the $i$-th income group
$N$ | total number of people
$y_i$ | average income of the $i$-th income group
$\mu$ | average income of the total population


The Atkinson index is very closely linked to the generalized
entropy index, and other related entropy based indices, such
as the Theil index (i.e. GE(1)). The Atkinson index results in
a value between 0 and 1, and

end of the distribution. The value of $\varepsilon$ is referred to as the
inequality aversion, and the index yields a higher value when
$\varepsilon$ is given a value closer to 1.
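
As a hedged illustration of how the inequality aversion parameter behaves, the sketch below implements the standard textbook form of the Atkinson index for grouped data, using the parameters listed above (group averages y_i, group sizes N_i, and ε). It is not necessarily the exact expression used in this paper; the function name `atkinson_index`, the restriction to strictly positive incomes, and the sample data are assumptions made for this example.

```python
import math
from typing import Sequence


def atkinson_index(incomes: Sequence[float], counts: Sequence[int], epsilon: float) -> float:
    """Textbook Atkinson index: 1 - (equally distributed equivalent income / mean).

    incomes[i] is the average income y_i of group i, counts[i] is N_i,
    and epsilon > 0 is the inequality aversion parameter.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if not incomes or len(incomes) != len(counts):
        raise ValueError("incomes and counts must be non-empty and the same length")
    if min(incomes) <= 0:
        raise ValueError("this sketch assumes strictly positive group incomes")

    n_total = sum(counts)
    mu = sum(y * n for y, n in zip(incomes, counts)) / n_total
    if math.isclose(epsilon, 1.0):
        # Limit case: geometric mean relative to the arithmetic mean.
        ede = math.exp(sum(n * math.log(y) for y, n in zip(incomes, counts)) / n_total)
    else:
        power = 1.0 - epsilon
        ede = sum(n / n_total * y ** power for y, n in zip(incomes, counts)) ** (1.0 / power)
    return 1.0 - ede / mu


# Two hypothetical groups of equal size with average incomes 1 and 9.
low_aversion = atkinson_index([1.0, 9.0], [1, 1], epsilon=0.5)
high_aversion = atkinson_index([1.0, 9.0], [1, 1], epsilon=1.0)
print(low_aversion, high_aversion)
```

For this sample the index is about 0.2 at ε = 0.5 and about 0.4 at ε = 1, showing the same distribution being scored as more unequal when the aversion parameter is raised.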


E. Tail Ratios

According to Atkinson [45], the

is affected by changes at tail ends of the distribution. In order
to account for this characteristic of the Gini index, we employ

Our model incorporates
the

concludes that

distribution to the lower 40% of the distribution. Palma

including the P90:P10, P50:P10 ratios.
Without loss of generality, $P_x$ represents the level of resources
allocated to greater than x% of the population, and is used
to

These

interpolation method [47] described below, where $P_x$ denotes
the desired percentile, $N$ is the population size, and $v_i$
represents the value at position $i$ in an ascending ordered
dataset, where $i$ is calculated using the following formula:

$$i = \frac{P_x (N + 1)}{100}$$
This gives us the rank position of the desired percentile, 
assuming the dataset is sorted in ascending order. We then 
interpolate between the value at this rank position in the 
ordered dataset, and the value at the subsequent position, in 
order to attain each percentile in our desired inter-decile ratio: 


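$$P = \begin{cases} \left( v_{\lceil i \rceil} - v_{\lfloor i \rfloor} \right) (i \bmod 1) + v_{\lfloor i \rfloor}, & i \bmod 1 \neq 0 \\ v_i, & i \bmod 1 = 0 \end{cases}$$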
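
The rank and interpolation steps can be sketched as follows. This is a minimal illustration, not the implementation used in this paper: the function names `percentile` and `tail_ratio` are hypothetical, ranks are treated as 1-based, and clamping the rank to the range [1, N] is an assumption added here to handle very small datasets.

```python
from typing import Sequence


def percentile(sorted_values: Sequence[float], p: float) -> float:
    """Percentile via rank i = p * (N + 1) / 100 and linear interpolation.

    sorted_values must be in ascending order; ranks are 1-based.
    """
    n = len(sorted_values)
    if n == 0:
        raise ValueError("empty dataset")
    i = p * (n + 1) / 100.0
    i = min(max(i, 1.0), float(n))  # assumption: clamp the rank to [1, N]
    lower = int(i)                  # floor of the rank
    frac = i - lower                # i mod 1
    v_lower = sorted_values[lower - 1]
    if frac == 0:
        return v_lower
    v_upper = sorted_values[lower]  # value at the subsequent position
    return (v_upper - v_lower) * frac + v_lower


def tail_ratio(values: Sequence[float], upper: float, lower: float) -> float:
    """Inter-decile ratio such as P90:P10 or P50:P10."""
    xs = sorted(values)
    return percentile(xs, upper) / percentile(xs, lower)


# Hypothetical balances, for illustration only.
balances = [1, 2, 3, 4, 5, 6, 7, 8, 9]
p90_p10 = tail_ratio(balances, 90, 10)
p50_p10 = tail_ratio(balances, 50, 10)
median_of_four = percentile([10, 20, 30, 40], 50)
```

With nine values the ranks for P10, P50 and P90 are whole numbers, so no interpolation is needed; with four values the median rank is 2.5, so the result is interpolated halfway between the second and third values.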


By themselves these ratios are not useful measurements, as
the range of their results is not bounded, unlike the Gini
index or HHI. They can be useful to reference when deciding


F. Jensen-Shannon Divergence 


As we are interested in measuring the changes to levels of
decentralization over time, we employ a method described by

to similarity or divergence between
two probability distributions. The JSD is useful because it
has a defined upper bound, which means it can be scaled for
comparison with other index values. In order to scale the JSD,
we normalize it by dividing the result by the upper bound of
the possible range of results, which in our model is $\log 2$.

We calculate the JSD between two intervals by comparing 
the distribution of each one across each measurement dimen- 
sion respectively. As such, 


The normalized JSD is then calculated as follows:

$$D_{JS}(p \| q) = \frac{1}{\log 2} \left( \frac{1}{2} D_{KL}(p \| m) + \frac{1}{2} D_{KL}(q \| m) \right)$$

where $m = \frac{1}{2}(p + q)$ is the average of $p$ and $q$, and
$D_{KL}(p \| m)$ and $D_{KL}(q \| m)$ are the Kullback-Leibler
divergences of $p$ and $q$ from $m$ respectively, defined as:

$$D_{KL}(p \| q) = \sum_{i} p(i) \log \frac{p(i)}{q(i)}$$


It is important to note that both vectors must be the same size,
therefore if n < m, we pad p with zeros until it matches the
size of q, or if m < n, we pad q with zeros until it matches
the size of p.

are ordered


dimension for both the 1 day, 30, 60 and 90 day intervals in 
order to track and quantify any changes in decentralization in 
specific subsystems over time. 
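
A minimal sketch of a normalized Jensen-Shannon divergence between two intervals is given below. The function names are hypothetical, the inputs are assumed to be probability vectors that each sum to 1, natural logarithms are used so that the upper bound is log 2, and the shorter vector is padded with zeros so that both have the same size.

```python
import math
from typing import List, Sequence


def _pad(p: Sequence[float], size: int) -> List[float]:
    """Pad a vector with zeros up to the requested size."""
    return list(p) + [0.0] * (size - len(p))


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """D_KL(p || q) = sum of p(i) * log(p(i) / q(i)), skipping terms where p(i) = 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def normalized_jsd(p: Sequence[float], q: Sequence[float]) -> float:
    """Jensen-Shannon divergence divided by its upper bound, log 2."""
    size = max(len(p), len(q))
    p, q = _pad(p, size), _pad(q, size)
    m = [(pi + qi) / 2.0 for pi, qi in zip(p, q)]  # average distribution
    jsd = 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
    return jsd / math.log(2)


# Hypothetical market shares in two intervals, for illustration only.
unchanged = normalized_jsd([0.5, 0.3, 0.2], [0.5, 0.3, 0.2])
disjoint = normalized_jsd([1.0], [0.0, 1.0])  # first vector is padded to [1.0, 0.0]
```

Identical distributions give 0 and distributions with no overlap give 1, so the result can be compared directly with other index values on the same scale.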


G. Deriving a Master Index 


As we are measuring Ethereum’s level of decentralization
over time, we need to account for its

which significant components of its infrastructure will change.
For example, we might measure the concentration in the
mev-boost relay market, but later decide to remove this metric as
innovations like enshrined PBS make them redundant [49].

Our model calculates an aggregate set of indices across a
number of dimensions of measurement, that can be used as
a further indicator of the relative level of decentralization in

Ethereum as it changes over time. Each relevant index, (i.e. 


i.e.:

$$V = \frac{\left( \prod_{i=1}^{n} (\delta_i \times w_i) \cdot 100 \right)^{1/n} - \min(\delta)}{\left( \max(\delta) - \min(\delta) \right) \cdot 10^{-2}}$$

where $\delta = \{metric_1, metric_2, ..., metric_n\}$, i.e. the set of
relevant metrics, $w$ is the respective weighting for each metric,
and $n = |\delta|$, the number of metrics being measured.
This master index can be used to 


The relevant metrics that are included in each aggregate 
index, along with their respective weightings, are listed below. 
The weightings are assigned based on the qualitative properties 
of the infrastructural component being measured. The actual 
weightings used in the formula are the percentage of each 
weighting relative to the sum of all weightings. 


Metric | Weight
Consensus nodes by client | 1
Consensus nodes by country | 1
Execution nodes by client | 1
Execution nodes by country | 1
Distribution of ETH by amount | 1
Amount staked by pool | 1
Blocks proposed by builder | 0.7
Blocks proposed by relay | 0.7
Number of userops per bundler | 0.2
Number of wallets per deployer | 0.2
Layer 2 rollups by relative TVL | 0.5
Stablecoins by relative TVL | 0.3
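
The conversion of the listed weightings into shares of their sum can be sketched as follows. The weights are taken from the table above; the variable names are assumptions made for this example, and the sketch covers only the normalization step, not the master index itself.

```python
# Weightings as listed in the table above.
weights = {
    "Consensus nodes by client": 1.0,
    "Consensus nodes by country": 1.0,
    "Execution nodes by client": 1.0,
    "Execution nodes by country": 1.0,
    "Distribution of ETH by amount": 1.0,
    "Amount staked by pool": 1.0,
    "Blocks proposed by builder": 0.7,
    "Blocks proposed by relay": 0.7,
    "Number of userops per bundler": 0.2,
    "Number of wallets per deployer": 0.2,
    "Layer 2 rollups by relative TVL": 0.5,
    "Stablecoins by relative TVL": 0.3,
}

# Each weighting expressed as a fraction of the sum of all weightings.
total_weight = sum(weights.values())
normalized_weights = {metric: w / total_weight for metric, w in weights.items()}

for metric, share in normalized_weights.items():
    print(f"{metric}: {share:.2%}")
```

The weightings sum to 8.6, so a metric weighted 1 contributes roughly 11.6% of the total and a metric weighted 0.2 contributes roughly 2.3%.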


This results in a series of aggregate indices for each interval 
within the time-range our sample data is taken from. The 
aggregate indices relate to the Gini, Atkinson, Normalized 
HHI and Normalized Shannon indices. This should allow us to 


us to examine 


V. RESULTS 


In this section we discuss the results of the application
of our model to a sample that was recorded at 24 hour
intervals across a period of 90 days, between the 23rd of May
2023 and the 23rd of August 2023. We discuss how the results


1S 


A. Discussion of Results 


As can be seen from the table below, 


highlights the need to 


and also highlights the fact that 


Metric | Gini | HHI | Atkinson | Shannon
Execution Nodes by Country
Execution Nodes by Client
Consensus Nodes by Country
Consensus Nodes by Client
Amount Staked by Pool
Native Assets by Address
Blocks by Builder
Blocks by Relays
User Operations by Bundler
Wallets by Deployer
Stablecoins by TVL
Rollups by TVL


Fig. 4. 90 day averages across all metrics 


The values in the table in Fig. 4 are color coded for
legibility, with 


of concentration, and colors closer to green indicating higher 
levels of decentralization. We observe that the index values for 
several metrics diverge significantly, particularly between the 
Gini index and HHI. In most cases this is expected, since the 
HHI is more focused on measuring concentration at the upper 
end of a distribution. Consider, without loss of generality, a
distribution with a large number of entities that have a low share
of resources. In this distribution the Gini index will increase
significantly, while the HHI will remain largely unaffected. If 
the same distribution contains a small number of entities that 
control nearly all the resources but that each have an equal 
share, the HHI will be relatively low, but will increase as the 
share between these controlling entities becomes less equal. 
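The divergence described above can be reproduced with a minimal sketch. The two distributions below are hypothetical and are not taken from our dataset, and the re-scaling of the HHI to the range 0 to 1 is assumed to be the usual (HHI - 1/n) / (1 - 1/n) normalization.

```python
def gini(values):
    """Gini index via the sorted-rank formula (0 = perfectly equal)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n


def normalized_hhi(values):
    """HHI over market shares, re-scaled from [1/n, 1] to [0, 1]."""
    total = sum(values)
    n = len(values)
    hhi = sum((v / total) ** 2 for v in values)
    return (hhi - 1 / n) / (1 - 1 / n)


# Hypothetical data: 4 large holders with equal shares plus 96 tiny holders.
equal_top = [24.0] * 4 + [4.0 / 96] * 96
# Same population, but one of the large holders now dominates the other three.
unequal_top = [60.0, 12.0, 12.0, 12.0] + [4.0 / 96] * 96

for name, dist in [("equal top", equal_top), ("unequal top", unequal_top)]:
    print(f"{name}: gini={gini(dist):.2f} hhi={normalized_hhi(dist):.2f}")
```

In both cases the Gini index stays high because of the long tail of small holders, while the re-scaled HHI is low when the controlling entities hold equal shares and rises once their shares become unequal.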


We observe this phenomenon clearly in several metrics in
the results table, for example in the Amount Staked by Pool
metric, where two entities control 60% of the market [50] (Lido
controls 31% while solo stakers control 29%). The shares of
these two entities are almost equal, and the distribution
between the entities controlling the other 40% of the market is
also relatively equal. This results in a re-scaled HHI of 0.2,
indicating only moderate concentration, while the Gini index
gives a value of 0.91, which more accurately reflects the
level of concentration in this area.

While the Gini index and the HHI are each better suited
to specific dimensions of measurement, there can often be
nuances that they fail to capture by themselves. An illustrative
example given by Buterin is where there are two hypothetical
societies with profound levels of inequality, one in which half
the population equally shares all the resources while the other
half has none, and the other in which one person has half of
all the resources and everyone else equally shares the remaining
half. In Buterin’s example, both distributions would result in
the same Gini index value, despite having radically different
characteristics.

These limitations form the rationale behind the inclusion
of the Shannon index within our model. The Shannon index
is sensitive to both the size of the distribution and the diversity
of different values in the distribution, and is therefore able to
highlight differences between distributions that the Gini index
does not capture.
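The two hypothetical societies above can be checked numerically with a short sketch. The population size of 1,000 is an arbitrary choice, and the normalization of the Shannon index (entropy of the shares divided by ln n, so that higher values mean a more even spread) is an assumption rather than necessarily the exact form used in our model.

```python
import math


def gini(values):
    """Gini index via the sorted-rank formula (0 = perfectly equal)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n


def normalized_shannon(values):
    """Shannon entropy of the shares divided by ln(n) (assumed normalization)."""
    total = sum(values)
    n = len(values)
    entropy = -sum((v / total) * math.log(v / total) for v in values if v > 0)
    return entropy / math.log(n)


n = 1000  # hypothetical population size
# Society A: half the population shares everything equally, the rest has nothing.
society_a = [2.0 / n] * (n // 2) + [0.0] * (n // 2)
# Society B: one person holds half, everyone else shares the other half equally.
society_b = [0.5] + [0.5 / (n - 1)] * (n - 1)

for name, dist in [("A", society_a), ("B", society_b)]:
    print(f"society {name}: gini={gini(dist):.3f} "
          f"shannon={normalized_shannon(dist):.3f}")
```

Under these assumptions the Gini index is approximately 0.5 for both societies, while the normalized Shannon values differ clearly, which is the kind of difference the Shannon index is included to surface.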

This is meaningful in considering the level of concentration
in areas where the distribution is smaller, for example with
'Stablecoins by Tvl'. In the result for this metric, the Gini
index is very high, indicating high concentration, but the
Shannon index shows a relatively low concentration. This
is a reflection of the fact that there are fewer stablecoins, and
hence less chance for a decentralized market to emerge.

As we can see, there is broad agreement between the
Atkinson and Shannon indices, which is not surprising
considering they are both based on the entropy model. However,
the Atkinson index is not used by our model except for
certain specific dimensions of measurement, because of its
non-standard inequality aversion parameter, which makes it
unsuitable for comparison with other studies. It can be useful,
however, for fine-tuning certain measurements when needed.

The observations of the recorded results have been orga- 
nized into the following sections which broadly correspond to 
the categories of metrics as described in section I. 


B. Results based on original Nakamoto Coefficient subsystem 
selection 


1) Network Nodes: When examining the data recorded
across the dimensions of measurement that are based on the
original Nakamoto Coefficient subsystem selection, we first
applied the Gini index to the metrics pertaining to network
nodes. The results of this analysis are displayed in figure 5,
which plots the Gini index of these metrics across our sample
time period.


We observe a relatively high Gini index value for both
Execution Nodes by Country and Consensus Nodes by Country,
which maintain average Gini index values between 0.79 and 0.85.
When examining the data more closely, this was found to be
caused by an outsized number of nodes that are situated in the
USA, which hosts 34% of consensus nodes and 44% of execution
nodes. Both numbers are significant in that they cross
a key consensus threshold of 33%, which is enough to cause
disruption to liveness under adverse conditions.

We observe that measuring Consensus Nodes by Client
yields an average Gini index of 0.57, which is a reasonably
safe level of client diversity, while Execution Nodes by Client


which changes be-


as a bug in a single client version can have consequences
for the network, as occurred in May 2023 when a bug in a
consensus client caused a lack of finality for a period of time
[52].

Fig. 5. Gini Indices for Network Nodes


We also refer to the HHI as a measurement of client diversity,
which may be a more accurate measurement than the
Gini index, due to the population size of the
distribution. In both cases the HHI suggests a highly
concentrated distribution, and therefore a lack of healthy diversity,
with a slight over-concentration in consensus nodes (mainly
driven by Prysm and Lighthouse [53]), and a pronounced
over-concentration in execution nodes (driven by Geth).

2) Amount Staked by Staking Pool: In examining the
Amount Staked by Staking Pool, we observe that the level


90% to 91% during the sample period. This suggests a very
high level of centralization among staking pools; however,
upon scrutinizing the underlying data, we observe that


a single entity, which is Lido. While this seems concerning 
at first, it is to be noted that Lido itself is a DAO, and as 
such is a decentralized entity. Lido commissions 29 separate 


a between each node 


operator, as documented on the VaNoM portal [54]. 
Rather than accounting for Lido’s decentralized quality by 
breaking down the composition of its node operators within the 


original dataset, we leverage the Atkinson index, by modifying 
the inequality aversion parameter to account for its level of 
decentralization. To do this we multiply the standard inequality 
parameter of 0.5 by a weighting derived from Lido’s market 
share, denoted by w, i.e.: 


$$a' = a \cdot (1 - w) = 0.5 \cdot (1 - 0.3) = 0.35$$


By using an inequality aversion parameter of a' = 0.35
we obtain an Atkinson index of 0.6 to 0.61, which remains
relatively static through the sample period, and which
represents a much lower level of centralization than suggested
by applying the Gini index. The value of the Atkinson index
indicates a moderate level of centralization, which is likely
caused by a number of entities with larger market shares,
such as Coinbase, which has a share of 10%.
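The adjustment of the inequality aversion parameter can be sketched as follows. The formula is the standard Atkinson index for a parameter below 1; the list of pool shares is hypothetical (only the Lido, solo staker and Coinbase figures echo values quoted in the text) and will not reproduce the reported index values.

```python
def atkinson(values, aversion):
    """Standard Atkinson index for 0 <= aversion < 1 (0 = perfectly equal)."""
    if not 0 <= aversion < 1:
        raise ValueError("this sketch only covers 0 <= aversion < 1")
    n = len(values)
    mean = sum(values) / n
    power = 1 - aversion
    # Equally-distributed-equivalent value: a power mean of order (1 - aversion).
    ede = (sum(v ** power for v in values) / n) ** (1 / power)
    return 1 - ede / mean


standard_aversion = 0.5
lido_share = 0.3  # w, the weighting derived from Lido's market share
adjusted_aversion = standard_aversion * (1 - lido_share)

# Hypothetical staking shares in percent; the tail values are illustrative.
pool_shares = [31, 29, 10, 5, 5, 4, 4, 3, 3, 2, 2, 1, 1]

print(f"adjusted aversion: {adjusted_aversion:.2f}")
print(f"standard: {atkinson(pool_shares, standard_aversion):.3f}")
print(f"adjusted: {atkinson(pool_shares, adjusted_aversion):.3f}")
```

Lowering the aversion parameter makes the index less sensitive to inequality, so the adjusted value is never higher than the value obtained with the standard parameter for the same distribution.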

3) Distribution of Native asset by Amount: We observe
indications of centralization with regard to
the distribution of the native asset, ETH. The most widely
used index for measuring the distribution of wealth within an
economic system is the Gini index, which gives an average
value of 0.76 for the distribution of ETH.

While this represents a relatively high level of centralization
with regard to control of ETH, we must give some
consideration to the nature of the entities at the top end of
the distribution that control large amounts of the asset, which
could include the smart contracts of DeFi dapps, bridges, etc.


One previous study by Glassnode 6 reported that 22.8% 


actual level of centralization in terms of ETH ownership is 
considerably less than is suggested through the application of 


the Gini index. While a portion of the smart contracts that hold 
ETH are wallets of private individuals, rather than dapps, the 
relative proportions of each is a question for further study. 


C. Metrics pertaining to PBS 


In analyzing the data with regard to the metrics pertaining
to PBS, we examined both the metrics
for Blocks proposed by Builder and the Blocks proposed by
Relay. For these metrics we applied the Herfindahl-Hirschman
index as a more suitable index, due to the fact that the builder
and relay space is naturally more concentrated, and comprised
of a number of known entities, making it similar in some
ways to more traditional economic sectors (as opposed to
500,000 pseudonymous validators for instance). The results
of our analysis are visualized in figure 6.

As can be observed from the visualization, the market
concentration for relays ranges from a re-scaled HHI of 0.16
(or 1,600) to 0.24 (or 2,400), whereas the builders display an
HHI that ranges from 0.15 (1,500) to 0.26 (2,600).

Referring back to the description of the HHI in section II, we
recall that the US DoJ regards any sector with an HHI of less than
1,500 as unconcentrated, and any sector with an HHI above 2,500
as highly concentrated.
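The categorization above can be expressed as a small helper. This is a sketch rather than part of our tooling: the function names are hypothetical, the label for the middle band follows the DoJ wording of "moderately concentrated", and the conversion assumes that a re-scaled HHI of 0.16 corresponds to 1,600 points as quoted in the text.

```python
def to_points(rescaled_hhi):
    """Convert a re-scaled HHI in [0, 1] to the 0-10,000 point scale."""
    return round(rescaled_hhi * 10_000)


def classify_hhi(points):
    """Classify an HHI value (in points) using the thresholds cited above."""
    if points < 1_500:
        return "unconcentrated"
    if points <= 2_500:
        return "moderately concentrated"
    return "highly concentrated"


# Endpoints of the ranges reported for relays and builders.
observed = {
    "relays (min)": 0.16,
    "relays (max)": 0.24,
    "builders (min)": 0.15,
    "builders (max)": 0.26,
}

for label, value in observed.items():
    points = to_points(value)
    print(f"{label}: {points} -> {classify_hhi(points)}")
```

Applied to the reported ranges, the relay values stay within the middle band, while the upper end of the builder range crosses into the highly concentrated category.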

Using these broad categories as a reference, we observe 


riod. We also observe that the level of concentration changes
enough to cross the boundaries of the above categories, even
within a 90 day period.

Fig. 6. HHI of Block Builders and Relays


When examining the Jensen-Shannon Divergence between
the first and last 24 hour intervals in the dataset for both
relevant metrics, we see a relatively high value compared to all
other metrics, which indicates that there are significant changes
within the market. This is a useful index to cross-reference as
it is capable of capturing nuance that other indices sometimes
fail to capture, for example

can frequently reverse their respective market shares, with 
of time. In this scenario, the overall level of inequality and 


participants can change radically. This is a quality that the JSD 


index can help to capture. 
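A minimal sketch of this property is shown below, using two hypothetical participants whose market shares are reversed between the first and last interval. The base-2 logarithm and the use of the divergence itself (rather than its square root, the Jensen-Shannon distance) are assumptions and may differ from the exact form used to produce the values in the table.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jsd(p, q):
    """Jensen-Shannon divergence between two share vectors of equal length."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def hhi(shares):
    """Raw HHI over shares that sum to 1."""
    return sum(s * s for s in shares)


# Hypothetical market: the same two participants, with their shares reversed.
first_interval = [0.7, 0.3]
last_interval = [0.3, 0.7]

print(f"HHI first={hhi(first_interval):.2f} last={hhi(last_interval):.2f}")
print(f"JSD between intervals={jsd(first_interval, last_interval):.3f}")
```

The HHI is identical for both intervals because it ignores which participant holds which share, whereas the divergence is clearly non-zero because the shares are compared participant by participant.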


Metric                           JSD

Amount Staked by Pool            0.0150451
Execution Nodes by Country       0.0073133
Execution Nodes by ClientBase    0.0118433
Consensus Nodes by Country       0.0058399
Consensus Nodes by Client        0.0077753
Blocks by Relays                 0.1324419
Blocks by Builder                0.2201886
Stablecoins by Tvl               0.0103413
Rollups by Tvl                   0.1058467
Native Assets by Address         0.0005811


It is worth noting that over a longer time-frame, our model 
should be able to account for changes to the ecosystem from 
newer innovations and architectural changes. These include en- 
shrined PBS or the introduction of distributed block building, 
and there are a number of such proposals under development 
currently. 


D. Miscellaneous Metrics 


1) Rollups by TVL: We observe a high level of
centralization when we examine Rollups by TVL, with a 90
day average Gini index value of 0.87, and a re-scaled HHI of
0.45. Both of these values make sense when we observe the
underlying data, in which a single rollup, Arbitrum One, has
54.3% of TVL across all rollups (down from 64.5%
at the start of the sample period). This is followed by
Optimism, which has a TVL share of 25.9%. There are 12 rollups
with a TVL share of less than 1%, and 7 with a TVL share of more
than 1%, the latter group exhibiting significant variance, from
which we would expect a high HHI value.

Interestingly, the one other metric that also shows a high JSD
value for the sample period is Rollups by TVL. This could
be explained by the launch of the Base network on August
9th, which significantly altered the relative market shares of
prominent rollups.


We observe that the level of centralization in the rollup
market, at the time the data was collected, is at an unhealthy
level for the Ethereum ecosystem. The TVL across all rollups
is 8.1B USD, which is approximately 30% of the 26.562B
USD TVL of Ethereum [56], and of which one single rollup
accounts for approximately 20% of Ethereum’s TVL.

It is worth noting again that these figures will more than 
likely change over time as a number of zk-EVM rollups will 
enter the market, which will dilute the marketplace and will 
likely affect market concentration significantly, underpinning 
the need for ongoing measurement of decentralization over 
time. 

2) Stablecoins by TVL: We observe the highest average 
value for the Gini index in the “Stablecoins by TVL” metric, 
which also has a markedly high HHI value. This suggests a 


which is obvious when we examine the underlying data,
which shows that just two stablecoins, USDC and USDT,
comprise over 80% of the market. As both stablecoins are
issued by corporations, this represents a significant area of
centralization within Ethereum. Stablecoins are used by many
people to interact with a wide array of dapps, and both USDC
and USDT are vulnerable to censorship, and have previously
frozen token holders’ funds on request from authorities
[58].

According to DefiLlama [56], the combined market cap for
stablecoins on Ethereum is 9.2B USD, which represents a
third of the TVL on Ethereum. This could arguably be the
weakest point of Ethereum’s overall decentralization.

3) Effective inflation rate adjusted for burn: As discussed
in section III, the Effective inflation rate adjusted for burn
is an important metric to measure, as a high inflation rate
can undermine the security of the network by diluting supply
and devaluing the asset, potentially leading to misaligned
incentives for the validators.

Since the introduction of EIP-1559, which introduced the
concept of the "base fee" that is set deterministically
via protocol consensus and burned in every block,
Ethereum’s effective issuance rate becomes negative when the
network usage / gas price crosses a certain threshold [59].
According to data from Ultra Sound Money
[60], the effective level of inflation of ETH since the Merge
has been -0.94%, and ETH is projected to continue to be a
deflationary asset in the long term, which is positive for the
network’s overall security and level of decentralization.

4) Percentage of total supply staked: The percentage of
total supply staked is related to the effective inflation rate
for the same reason as previously outlined, in that a high
issuance has a dilutionary effect on the circulating supply,
which incentivizes asset holders to stake in order to counteract
the dilution, resulting in a feedback cycle that can undermine
the network’s security. Ethereum’s economics are designed to
reduce the rewards issued per validator as more validators come
online [26], which theoretically reduces the incentive to stake
once the percentage of staked assets reaches a certain threshold.
Currently the amount of ETH staked is 24,932,109 ETH according to
beaconcha.in [61], out of 120,218,472 ETH total supply according
to Etherscan [62], resulting in a ratio of approximately 20.7%,
which is well within the bounds of what is a safe level of total
supply staked.
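The ratio quoted above follows directly from the two reported figures, as the short check below shows. The variable names are illustrative only.

```python
# Figures as reported from beaconcha.in [61] and Etherscan [62].
staked_eth = 24_932_109
total_supply_eth = 120_218_472

staked_ratio = staked_eth / total_supply_eth
print(f"share of total supply staked: {staked_ratio:.1%}")
```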


E. Metrics pertaining to Account Abstraction 


The metrics that pertain to Account Abstraction, which are 


striking insofar as they have especially low averages for the
Gini index but very high values for the HHI. This
indicates that there is an issue with the dataset that warrants
further scrutiny. Upon investigation we observe that the 90 day
Gini indices for User Operations by Bundler have a range of 0
to 1, with a median of 0.6 and a standard deviation of 0.36.
This highly unusual range and variance can be attributed to the 


Upon inspecting the underlying data, we observe that the 
maximum number of user operations processed in a single day 
during the sample period is only 301. At this nascent stage, 


It is worth noting however that this data looks at Ethereum 
mainnet itself, and does not consider any of the L2 rollups, 
abstraction infrastructure, and which is an area for future study. 


The Wallets by Deployer metric shows very similar charac- 
teristics to the User Operations by Bundler and follows much 
the same pattern of distribution throughout the sample period, 


The various metrics outlined in this paper will be useful
in tracking the changes in decentralization over time as the
account abstraction space matures and continues to see more
adoption. The influence of ERC-4337 bundlers on the effective
level of decentralization of the network as a whole is somewhat
limited, so even a moderately high level of centralization is
acceptable.


F. Master Indices


For every index that we track, we derive a master index that
is an aggregate value derived from the value of each dimension
of measurement for that index. This means that we can derive
a single value for every day in the sample period, rather than
12 separate values for each day (i.e. one for each dimension
of measurement).

Fig. 7. Master Indices of Gini Index and Herfindahl-Hirschman Index


We are specifically interested in the master index for both
the Gini Index and Herfindahl-Hirschman Index for each day
in the sample period. We have sub-sampled 18 data points
from the 90 day period and have charted them in figure 7


a crude high-level indicator of the overall effective level of 


measurement of the network, and in theory we should be able to
maintain a single relative measurement even as the dimensions
of measurement change over time.
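One way to picture the master index described above is as a weighted mean of the per-dimension values for a single day. The sketch below assumes that aggregation rule, which is an illustration and not necessarily the exact computation used in our model. The weights are those rows of the weighting table whose values are legible, and the daily index values are hypothetical.

```python
# Weights taken from the legible rows of the weighting table.
WEIGHTS = {
    "Execution nodes by client": 1.0,
    "Execution nodes by country": 1.0,
    "Blocks proposed by builder": 0.7,
    "Blocks proposed by relay": 0.7,
    "Number of userops per bundler": 0.2,
    "Number of wallets per deployer": 0.2,
    "Layer 2 rollups by relative TVL": 0.5,
    "Stablecoins by relative TVL": 0.3,
}


def master_index(daily_values, weights=WEIGHTS):
    """Weighted mean of one index across the dimensions present that day."""
    used = [metric for metric in daily_values if metric in weights]
    if not used:
        raise ValueError("no weighted dimensions present")
    total_weight = sum(weights[metric] for metric in used)
    weighted_sum = sum(daily_values[metric] * weights[metric] for metric in used)
    return weighted_sum / total_weight


# Hypothetical Gini values for one day; account abstraction metrics omitted.
gini_for_day = {
    "Execution nodes by client": 0.80,
    "Execution nodes by country": 0.85,
    "Blocks proposed by builder": 0.70,
    "Blocks proposed by relay": 0.65,
    "Layer 2 rollups by relative TVL": 0.87,
    "Stablecoins by relative TVL": 0.90,
}

print(f"master Gini index: {master_index(gini_for_day):.3f}")
```

Because the weights are re-normalized over whichever dimensions are present, a dimension can be added or dropped without changing the scale of the result, which matches the intent of keeping a single relative measurement as the dimensions change.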


As can be seen from the sample above, the overall effective
level of decentralization changes over time, and this is a
reflection of the changes within the various dimensions of
measurement, specifically the Gini index and HHI in this case.
Overall we can see a trend of reduction in the value of the
Gini index across all dimensions of


from 7.5% to 10%. Note that the anomalous reduction in the
level of centralization around the July 8th to July 13th period
is due to gaps in the source data. Also, we have excluded
the values for metrics related to account abstraction in order
to get a clearer representation of the overall trends in the data.

It is important to note that the values of the master index are
not like other indices, insofar as it is not the case that values


comparison the same value at other points in time.

VI. CONCLUSION AND DISCUSSION

Our results clearly demonstrate that the overall Ethereum
ecosystem displays concentrations of control that arguably
leave it less decentralized than the community would
aspire to. The results suggest the need for diligence in efforts
to develop and maintain a healthy level of decentralization.
This can be seen in areas such as client diversity (in terms of
network nodes), market share of staking pools, market share
of rollups, and stablecoins. While stablecoins have a much
lower TVL compared to Ethereum as a whole, this belies the
important function stablecoins serve in the ecosystem, and thus
presents a red flag for centralization concerns.

As the ecosystem evolves, there will be changes in the
important components of the ecosystem’s overall infrastructure
beyond the core Ethereum protocol, and there will also be
changes to the extent of their impact on the overall effective
level of decentralization. As Barnabé Monnot states: "There
are things that the protocol does not see, but cares about" [63].
As such, we should expect our model to adapt to these changes
over time.

Not only will the components that we measure change over
time, but also our model will evolve as we learn more about the
ecosystem and how to measure it. While we have demonstrated


that attempting to measure decentralization using a single


While we have attempted to use a series of metrics that
can be used to cross reference each other to paint a larger
picture, there are still some areas that can be explored further.
One potential area of exploration is to study the methodologies
used in other areas that face a similar challenge. One example
of research that can be explored is the OECD’s methodologies
to measure market competition [64], in which they explore
the use of numerous models and indices under two broad
categories of structural measures and performance measures.

Another potential direction for research is to develop a
way to measure potential collusion between entities within a
population, rather than assuming that each entity in a
population is totally independent, as our current model assumes.
Vitalik Buterin discussed this approach and described it
as being done by assigning pairwise coordination coefficients
to all distinct pairs of entities within a population. This would
mean that any pair of entities within a population would
have a coefficient between 0 and 1, where 1 implies they are so
tightly coordinated and aligned that they should be measured as a
single actor, and 0 implies they are completely independent.
The properties that determine the level of coordination and
alignment, and the resultant coordination coefficient, could be
as simple as being in the same country, or being on the same
network, etc.
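To make the idea of pairwise coordination coefficients concrete, the sketch below shows one possible way they could enter a concentration measure. This construction is our own assumption for illustration and is not taken from Buterin's description: each pair of shares contributes to a coordination-adjusted HHI in proportion to its coefficient, so that the result equals the ordinary HHI when all coefficients are 0 and treats fully coordinated entities as a single actor when their coefficient is 1. The shares and matrices are hypothetical.

```python
def coordinated_hhi(shares, coordination):
    """HHI generalized with pairwise coordination coefficients in [0, 1].

    coordination[i][j] is the coefficient for entities i and j; the
    diagonal is always treated as 1 (an entity is fully aligned with itself).
    """
    n = len(shares)
    total = 0.0
    for i in range(n):
        for j in range(n):
            coefficient = 1.0 if i == j else coordination[i][j]
            total += coefficient * shares[i] * shares[j]
    return total


shares = [0.4, 0.3, 0.2, 0.1]  # hypothetical market shares

independent = [[0.0] * 4 for _ in range(4)]

# The first two entities are fully coordinated, e.g. they share an operator.
paired = [[0.0] * 4 for _ in range(4)]
paired[0][1] = paired[1][0] = 1.0

print(f"independent: {coordinated_hhi(shares, independent):.2f}")
print(f"first two coordinated: {coordinated_hhi(shares, paired):.2f}")
```

With the first two entities fully coordinated, the result matches the ordinary HHI of a market in which they are merged into a single entity with a 70% share, which is the behaviour the coefficient of 1 is meant to express.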

Obviously an approach such as this would require much 
more data about the individual entities in the population, and 
would present a challenge to gathering the data and ensuring 
data quality, which is an important consideration when decid- 
ing to develop a model based on that data. Nevertheless, this 
represents an interesting direction for future research. 

One important factor in the measurement of decentralization
is that while our models become more complex, the
ability for them to be widely understood remains an objective
that should not be lost. If the model itself becomes complex
enough to preclude analysis from a diverse audience, then it
loses its ability to act as a reference point for communication
and discussion, which remains its primary objective.

VIII. APPENDIX: LIST OF DATA SOURCES


The following table lists the various sources of data for each data point that is referenced in our model. The data from these 
sources were recorded programmatically and compiled into a database using software that was specifically designed for that 
purpose. 


Based on original Nakamoto Coefficient subsystem selection:

Consensus nodes by client                          https://migalabs.es/api/v1/client-distribution
Consensus nodes by country                         https://migalabs.es/api/v1/geo-distribution
Execution nodes by client                          https://www.ethernodes.org
Execution nodes by country                         https://www.ethernodes.org/countries
Distribution of native asset by amount             https://data.messari.io/api/v1/assets/ethereum/metrics
Amount staked by Pool / Staking Service Provider
Blocks proposed by builder
Blocks proposed by relayer

Metrics pertaining to Account Abstraction:

Number of user operations per bundler
Number of wallets per deployer

Miscellaneous Metrics:

Rollups by relative TVL / size / volume            https://l2beat.com/scaling/tvl
Stablecoins                                        https://stablecoins.llama.fi/stablecoins


